NSF PAR Search | NSF Public Access Repository

Making AI Less 'Thirsty'

https://doi.org/10.1145/3724499

Li, Pengfei; Yang, Jianyi; Islam, Mohammad A; Ren, Shaolei (July 2025, Communications of the ACM)

Uncovering and addressing the secret water footprint of AI models
Full Text Available
Holistic Design towards Resource-Stringent Binary Vector Symbolic Architecture

Duan, Shijin; Narkthong, Nuntipat; Luo, Yukui; Ren, Shaolei; Xu, Xiaolin (June 2025, Design Automation Conference (DAC))

Classification tasks on ultra-lightweight devices must run under tight resource constraints and deliver swift responses. Binary Vector Symbolic Architecture (VSA) is a promising approach due to its minimal memory requirements and fast execution times compared to traditional machine learning (ML) methods. Nonetheless, binary VSA's practicality is limited by its inferior inference performance and a design that prioritizes algorithmic over hardware optimization. This paper introduces UniVSA, a co-optimized binary VSA framework for both algorithm and hardware. UniVSA not only significantly enhances inference accuracy beyond current state-of-the-art binary VSA models but also reduces memory footprints. It incorporates novel, lightweight modules and a design flow tailored for optimal hardware performance. Experimental results show that UniVSA surpasses traditional ML methods in terms of performance on resource-limited devices, achieving smaller memory usage, lower latency, reduced resource demand, and decreased power consumption.
Full Text Available
Toward Environmentally Equitable AI

https://doi.org/10.1145/3725980

Hajiesmaili, Mohammad; Ren, Shaolei; Sitaraman, Ramesh; Wierman, Adam (July 2025, Communications of the ACM)

The environmental cost of AI is often disproportionately higher in certain regions than in others.
Full Text Available
Hardware-Sensitive Fairness in Heterogeneous Federated Learning

https://doi.org/10.1145/3703627

Talukder, Zahidur; Lu, Bingqian; Ren, Shaolei; Islam, Mohammad Atiqul (March 2025, ACM Transactions on Modeling and Performance Evaluation of Computing Systems)

Federated learning (FL) is a promising technique for decentralized privacy-preserving Machine Learning (ML) with a diverse pool of participating devices with varying device capabilities. However, existing approaches to handle such heterogeneous environments do not consider “fairness” in model aggregation, resulting in significant performance variation among devices. Meanwhile, prior works on FL fairness remain hardware-oblivious and cannot be applied directly without severe performance penalties. To address this issue, we propose a novel hardware-sensitive FL method called FairHetero that promotes fairness among heterogeneous federated clients. Our approach offers tunable fairness within a group of devices with the same ML architecture as well as across different groups with heterogeneous models.
Full Text Available
Learning-Augmented Online Control for Decarbonizing Water Infrastructures

https://doi.org/10.1145/3679240.3734595

Yang, Jianyi; Li, Pengfei; Li, Tongxin; Wierman, Adam; Ren, Shaolei (June 2025, ACM)

Full Text Available
Learning for Sustainable Online Scheduling with Competitive Fairness Guarantees

https://doi.org/10.1145/3679240.3734615

Li, Pengfei; Christianson, Nicolas; Yang, Jianyi; Wierman, Adam; Ren, Shaolei (June 2025, ACM e-Energy 2025)

Full Text Available
Towards Vector Optimization on Low-Dimensional Vector Symbolic Architecture

Duan, Shijin; Liu, Yejia; Liu, Gaowen; Kompella, Ramana Rao; Ren, Shaolei; Xu, Xiaolin (March 2025, The Conference on Parsimony and Learning (CPAL))

Vector Symbolic Architecture (VSA) is emerging in machine learning due to its efficiency, but it is hindered by issues of hyperdimensionality and accuracy. As a promising mitigation, the Low-Dimensional Computing (LDC) method significantly reduces the vector dimension by a factor of 100 while maintaining accuracy, by employing gradient-based optimization. Despite its potential, LDC optimization for VSA is still underexplored. Our investigation into vector updates underscores the importance of stable, adaptive dynamics in LDC training. We also reveal the overlooked yet critical roles of batch normalization (BN) and knowledge distillation (KD) in standard approaches. Besides the accuracy boost, BN does not add computational overhead during inference, and KD significantly enhances inference confidence. Through extensive experiments and ablation studies across multiple benchmarks, we provide a thorough evaluation of our approach and extend the interpretability of binary neural network optimization similar to LDC, previously unaddressed in BNN literature.
Full Text Available
Learning-Augmented Decentralized Online Convex Optimization in Networks

https://doi.org/10.1145/3700420

Li, Pengfei; Yang, Jianyi; Wierman, Adam; Ren, Shaolei (December 2024, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

This paper studies learning-augmented decentralized online convex optimization in a networked multi-agent system, a challenging setting that has remained under-explored. We first consider a linear learning-augmented decentralized online algorithm (LADO-Lin) that combines a machine learning (ML) policy with a baseline expert policy in a linear manner. We show that, while LADO-Lin can exploit the potential of ML predictions to improve the average cost performance, it cannot have guaranteed worst-case performance. To address this limitation, we propose a novel online algorithm (LADO) that adaptively combines the ML policy and expert policy to safeguard the ML predictions to achieve strong competitiveness guarantees. We also prove the average cost bound for LADO, revealing the tradeoff between average performance and worst-case robustness and demonstrating the advantage of training the ML policy by explicitly considering the robustness requirement. Finally, we run an experiment on decentralized battery management. Our results highlight the potential of ML augmentation to improve the average performance as well as the guaranteed worst-case performance of LADO.
Full Text Available
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature

Zhou, Tong; Zhao, Xuandong; Xu, Xiaolin; Ren, Shaolei (November 2024, NeurIPS)

Text watermarks for large language models (LLMs) have been commonly used to identify the origins of machine-generated content, which is promising for assessing liability when combating deepfake or harmful content. While existing watermarking techniques typically prioritize robustness against removal attacks, unfortunately, they are vulnerable to spoofing attacks: malicious actors can subtly alter the meanings of LLM-generated responses or even forge harmful content, potentially misattributing blame to the LLM developer. To overcome this, we introduce a bi-level signature scheme, Bileve, which embeds fine-grained signature bits for integrity checks (mitigating spoofing attacks) as well as a coarse-grained signal to trace text sources when the signature is invalid (enhancing detectability) via a novel rank-based sampling strategy. Compared to conventional watermark detectors that only output binary results, Bileve can differentiate 5 scenarios during detection, reliably tracing text provenance and regulating LLMs. The experiments conducted on OPT-1.3B and LLaMA-7B demonstrate the effectiveness of Bileve in defeating spoofing attacks with enhanced detectability.
Full Text Available
Reconciling the contrasting narratives on the environmental impact of large language models

https://doi.org/10.1038/s41598-024-76682-6

Ren, Shaolei; Tomlinson, Bill; Black, Rebecca W; Torrance, Andrew W (December 2024, Scientific Reports)

The recent proliferation of large language models (LLMs) has led to divergent narratives about their environmental impacts. Some studies highlight the substantial carbon footprint of training and using LLMs, while others argue that LLMs can lead to more sustainable alternatives to current practices. We reconcile these narratives by presenting a comparative assessment of the environmental impact of LLMs vs. human labor, examining their relative efficiency across energy consumption, carbon emissions, water usage, and cost. Our findings reveal that, while LLMs have substantial environmental impacts, their relative impacts can be dramatically lower than human labor in the U.S. for the same output, with human-to-LLM ratios ranging from 40 to 150 for a typical LLM (Llama-3-70B) and from 1200 to 4400 for a lightweight LLM (Gemma-2B-it). While the human-to-LLM ratios are smaller with regard to human labor in India, these ratios are still between 3.4 and 16 for a typical LLM and between 130 and 1100 for a lightweight LLM. Despite the potential benefit of switching from humans to LLMs, economic factors may cause widespread adoption to lead to a new combination of human and LLM-driven work, rather than a simple substitution. Moreover, the growing size of LLMs may substantially increase their energy consumption and lower the human-to-LLM ratios, highlighting the need for further research to ensure the sustainability and efficiency of LLMs.
Full Text Available
